11-928 Master’s Thesis Symmetric Probabilistic Alignment
Abstract
The CMU Example-Based Machine Translation (EBMT) system has been deployed successfully in many projects over the years. Because the system relies on parallel corpora, a good alignment algorithm is essential, yet alignment has been studied relatively less than the other components of EBMT. For this reason, we developed a new alignment algorithm that uses statistical information drawn from parallel corpora together with heuristics based on human linguistic knowledge. Unlike most alignment approaches in Statistical Machine Translation (SMT) systems, our algorithm uses only bilingual dictionaries trained by other systems as its statistical information, calculates alignment scores bidirectionally, and aims to align source fragments of up to 8 words. In our experiments so far, it outperformed the old heuristic-based alignment algorithm in both alignment accuracy and translation accuracy in EBMT. Its performance was very close to the state of the art in SMT systems, for which we picked IBM Model 4 as the point of comparison, and a combination of our new method and IBM Model 4 performed best.
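To make the idea of bidirectional scoring concrete, the following is a minimal Python sketch, not the thesis's actual algorithm. It assumes two hypothetical word-level dictionaries (`src2tgt` and `tgt2src`, mapping word pairs to translation probabilities), a hypothetical smoothing floor for unseen pairs, and a simple average of the two directional scores; the abstract does not specify the scoring formula or the heuristics, so all of these are illustrative choices. Only the limit of 8 source words comes from the abstract.

```python
from math import log

MAX_SOURCE_LEN = 8  # source-fragment length limit stated in the abstract


def directional_score(from_words, to_words, lexicon, floor=1e-6):
    """Average log-probability of each word in from_words under its
    best-matching word in to_words. `floor` (an assumed smoothing value)
    is used for word pairs missing from the dictionary."""
    total = 0.0
    for f in from_words:
        best = max((lexicon.get((f, t), floor) for t in to_words), default=floor)
        total += log(best)
    return total / len(from_words)


def symmetric_score(src, tgt, src2tgt, tgt2src):
    """Average of the source-to-target and target-to-source scores
    (an assumed way of combining the two directions)."""
    forward = directional_score(src, tgt, src2tgt)
    backward = directional_score(tgt, src, tgt2src)
    return 0.5 * (forward + backward)


def best_target_span(source_fragment, target_sentence, src2tgt, tgt2src):
    """Exhaustively score every contiguous target span and return
    (score, start, end) for the best one, with `end` exclusive."""
    if not 1 <= len(source_fragment) <= MAX_SOURCE_LEN:
        raise ValueError("source fragment must contain 1 to 8 words")
    best = None
    n = len(target_sentence)
    for start in range(n):
        for end in range(start + 1, n + 1):
            score = symmetric_score(
                source_fragment, target_sentence[start:end], src2tgt, tgt2src
            )
            if best is None or score > best[0]:
                best = (score, start, end)
    return best


# Toy dictionaries; the probabilities are made up for illustration.
src2tgt = {("la", "the"): 0.9, ("maison", "house"): 0.8}
tgt2src = {("the", "la"): 0.9, ("house", "maison"): 0.8}
result = best_target_span(
    ["la", "maison"], ["the", "house", "is", "blue"], src2tgt, tgt2src
)
```

In this toy case the forward score alone cannot distinguish "the house" from longer spans that contain it, while the backward score penalizes target words with no source counterpart; combining both directions selects the tight span covering "the house".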
Similar resources
Development of Polygon Reduction Algorithms for Symmetric 3D Models
This Master’s thesis describes two polygon reduction algorithms suitable for symmetric 3D models. Also, a continuous symmetry measure is developed which makes it possible to, on a continuous scale, quantify the amount of symmetry an object possesses. Typically, a polygon reduction algorithm takes a 3D model as input and generat...
Attack-tree based risk analysis of Estonian i-voting
This report analyzes two independent works published in 2014 that model security threats of the Estonian i-voting scheme using attack trees. The first one, the master’s thesis of Tanel Torn [11], constructs several realistic attack trees for various types of attacks on the Estonian i-voting system and evaluates them using three different state-of-the-art methodologies proposed in the attack-tree literature....
Master’s Thesis Research Proposal
This is a proposal for the research I wish to do for my Master’s thesis. It is an attempt to categorize what I know, what I don’t know, what I need to do, and where I need help. It also consists of my attempt to completely survey the literature.
Symmetric Probabilistic Alignment for Example-Based Translation
Since subsentential alignment is critically important to the translation quality of an Example-Based Machine Translation (EBMT) system which operates by finding and combining phrase-level matches against the training examples, we recently decided to develop a new alignment algorithm for the purpose of improving the EBMT system’s performance. Unlike most algorithms in the literature, this new Sy...
Microwave Tomography for Breast Cancer Detection Master’s thesis in Master’s of Biomedical Engineering